One of the properties in cassandra.yaml is seeds. Its purpose is somewhat similar to the agent's failover list. It is a list of addresses or hostnames that Cassandra uses at startup to discover other nodes and learn the ring topology. When the topology changes because nodes are added to or removed from the cluster, it might be necessary to update the nodes' seeds lists. This document identifies different scenarios and describes how to manage the seeds list for each of them.
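For orientation, a stock Apache Cassandra cassandra.yaml stores the seeds value under seed_provider as a single comma-separated string. The Python sketch below builds such a string from a list of addresses and renders the surrounding fragment. The helper names (format_seeds, render_seed_provider) and the sample addresses are illustrative assumptions, not part of RHQ or its installer.

```python
def format_seeds(addresses):
    """Return a comma-separated seeds string, dropping blanks and duplicates but keeping order."""
    seen = []
    for addr in addresses:
        addr = addr.strip()
        if addr and addr not in seen:
            seen.append(addr)
    return ",".join(seen)


def render_seed_provider(addresses):
    """Render the seed_provider fragment in the layout used by a stock cassandra.yaml."""
    return (
        "seed_provider:\n"
        "    - class_name: org.apache.cassandra.locator.SimpleSeedProvider\n"
        "      parameters:\n"
        f"          - seeds: \"{format_seeds(addresses)}\"\n"
    )


# Hypothetical addresses, used only for illustration.
print(render_seed_provider(["10.0.0.1", "10.0.0.2"]))
```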
In this scenario the user is going through a new RHQ install that will include a single storage node and a single server on the same machine. The user kicks off the installation by executing rhqctl install. See RHQ Control Script for more details on the rhqctl script.
Run storage node installer for node N1
Storage installer sets the seeds property in cassandra.yaml
The value only consists of the hostname/address of N1
Storage installer updates the rhq.cassandra.seeds property in rhq-server.properties
Storage node is started
Run server installer
Server installer applies schema to storage node
Server is started
Run agent installer
Agent is started
Agent sends inventory report to server that includes N1
Upon import the server adds a row in the rhq_storage_node table for N1
Note that each service is started as soon as it is installed.
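The step in which the storage installer updates rhq.cassandra.seeds in rhq-server.properties can be pictured with a small Python sketch. It is only an illustration: the set_property helper is hypothetical, the hostname is made up, and the exact value format that RHQ expects for this property is not described here, so a plain hostname is assumed.

```python
def set_property(properties_text, key, value):
    """Return properties_text with key set to value, appending the key if it is absent."""
    lines = properties_text.splitlines()
    replaced = False
    for i, line in enumerate(lines):
        # Compare only the part before the first '=' so comment lines never match.
        if line.split("=", 1)[0].strip() == key:
            lines[i] = f"{key}={value}"
            replaced = True
    if not replaced:
        lines.append(f"{key}={value}")
    return "\n".join(lines) + "\n"


# Assumed starting content of rhq-server.properties; real files contain many more settings.
server_props = "# other settings\nrhq.cassandra.seeds=\n"
updated = set_property(server_props, "rhq.cassandra.seeds", "n1.example.com")
print(updated)
```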
This scenario is an extension of scenario 1. Assume that there is already a storage node N1 in inventory, and it also has a row in the rhq_storage_node table. The user installs the server by executing rhqctl install --server.
Run server installer
Server installer obtains storage node connection params by querying the rhq_storage_node table
Server installer applies schema changes to N1
No changes are made since schema was already applied in scenario 1.
Because no storage node is installed, the rhq.cassandra.seeds property does not get set. The server installer instead gets the storage node connection parameters by querying the rhq_storage_node table.
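One way to picture how the server installer finds the storage nodes is a simple fallback: use rhq.cassandra.seeds when it is set, otherwise query the rhq_storage_node table. The Python sketch below assumes that precedence, which is consistent with the scenarios in this document but is not confirmed by them. The resolve_storage_nodes function and the query callback are hypothetical stand-ins for the real installer code and database access.

```python
def resolve_storage_nodes(server_properties, query_storage_node_table):
    """Return storage node addresses from the property if set, else from the table.

    server_properties: dict of values read from rhq-server.properties.
    query_storage_node_table: callable returning addresses stored in rhq_storage_node.
    """
    seeds = server_properties.get("rhq.cassandra.seeds", "").strip()
    if seeds:
        # Assumed format: a comma-separated list of addresses.
        return [s.strip() for s in seeds.split(",") if s.strip()]
    return list(query_storage_node_table())


# Scenario 2: the property is not set, so the table is consulted.
print(resolve_storage_nodes({}, lambda: ["N1"]))
```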
This scenario is an extension of scenario 1. It could also be considered an extension of scenario 2. The user installs a second storage node by executing rhqctl install --storage.
Storage installer runs for node N2
Storage installer queries the rhq_storage_node table for an up-to-date seeds list
Storage installer sets the seeds property in cassandra.yaml
The value consists of the addresses for N1 and N2
N2 is started
Agent installer runs
Agent is started
Agent sends inventory report to server that includes N2
Upon import the server does the following
Adds a row in the rhq_storage_node table for N2
Schedules an operation with the agent for N1 to update its seeds list so that it includes N2
The storage installer still updates rhq-server.properties in this scenario, but that step is not listed because no server is being installed. The storage installer queries the rhq_storage_node table so that N2 has an up-to-date seeds list. Without this, N2 would not be able to form a cluster with N1. At this point N2 has an up-to-date seeds list, but N1 does not. This is OK: the two nodes can still form a cluster. The key point for N1 is that its seeds list is updated before its next restart, which happens in step 8b, where the server schedules an operation to update N1.
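The bookkeeping in this scenario reduces to two results: the seeds list written for the new node, and the set of existing nodes that need a scheduled update. A minimal Python sketch of that logic follows. The plan_node_addition function is a hypothetical name, and the existing node list stands in for the rows read from the rhq_storage_node table.

```python
def plan_node_addition(existing_nodes, new_node):
    """Return (seeds for the new node, existing nodes whose seeds list must be updated)."""
    # The new node starts with every known node plus itself.
    seeds_for_new_node = list(existing_nodes)
    if new_node not in seeds_for_new_node:
        seeds_for_new_node.append(new_node)
    # Every node that was already installed still has the old list.
    nodes_to_update = [n for n in existing_nodes if n != new_node]
    return seeds_for_new_node, nodes_to_update


seeds, to_update = plan_node_addition(["N1"], "N2")
print(seeds)      # ['N1', 'N2']
print(to_update)  # ['N1']
```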
This is an extension of scenario 3. We already have storage nodes N1 and N2. The user installs a third storage node by executing rhqctl install --storage.
Storage installer runs for node N3
Storage installer queries the rhq_storage_node table for an up-to-date seeds list
Storage installer sets the seeds property in cassandra.yaml
The value consists of the addresses for N1, N2, and N3
N3 is started
Agent installer runs
Agent is started
Agent sends discovery report to server that includes N3
Upon import the server does the following
Adds a row in the rhq_storage_node table for N3
Schedules an operation with the agents for N1 and N2 to update their seeds lists so that they include N3
The only difference between this scenario and scenario 3 is that step 8b here updates multiple nodes. Even though N1 and N2 initially do not have N3 in their seeds lists, N3 will still be able to join the cluster because it has an up-to-date list at startup.
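The window in which older nodes hold a stale seeds list can be illustrated with a toy model. The Python sketch below is not RHQ code: the Cluster class and its methods are hypothetical, and "running pending operations" stands in for the agent operations that the server schedules on import.

```python
class Cluster:
    """Toy model that tracks the seeds list stored on each storage node."""

    def __init__(self):
        self.seeds = {}    # node name -> seeds list as written in its cassandra.yaml
        self.pending = []  # nodes with a scheduled "update seeds list" operation

    def install_node(self, node):
        # The installer gives the new node the full, current list including itself.
        self.seeds[node] = list(self.seeds) + [node]
        # On import, the server schedules an update for every other node.
        self.pending = [n for n in self.seeds if n != node]

    def run_pending_operations(self):
        members = list(self.seeds)
        for node in self.pending:
            self.seeds[node] = list(members)
        self.pending = []

    def stale_nodes(self):
        members = set(self.seeds)
        return [n for n, s in self.seeds.items() if set(s) != members]


cluster = Cluster()
cluster.install_node("N1")
cluster.install_node("N2")
cluster.run_pending_operations()
cluster.install_node("N3")
stale_before = cluster.stale_nodes()  # N1 and N2 do not list N3 yet
cluster.run_pending_operations()
stale_after = cluster.stale_nodes()   # every node now lists N1, N2, and N3
print(stale_before, stale_after)
```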
The user is going through an initial install of RHQ and decides up front to deploy multiple storage nodes.
Storage installer runs for node N1
Specify seeds list as N1, N2, N3
Agent installer runs
Storage installer runs for node N2
Specify seeds list as N1, N2, N3
Agent installer runs
Storage installer runs for node N3
Specify seeds list as N1, N2, N3
Agent installer runs
Server installer runs (on a separate machine from the storage nodes)
Prior to running the installer, the user manually sets the rhq.cassandra.seeds property in rhq-server.properties
Server installer applies schema to storage node cluster
Server is started
Agents send inventory reports to server that include N1, N2, and N3
Upon import, server adds rows in rhq_storage_node table for N1, N2, and N3
Unlike prior scenarios where nodes are added incrementally, the storage nodes are installed up front. The user specifies the seeds list for each node during the install. The user has to do this because the RHQ server is not yet installed. When the user knows before the deployment that multiple storage nodes will be needed, it is better to install the nodes up front because it reduces the maintenance operations involved. See Adding Storage Nodes for details on the tasks involved when new nodes are added to the cluster.
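Because the seeds lists are entered by hand in this scenario, a quick consistency check before starting the nodes can catch a node that was given an incomplete list. The Python sketch below is a hypothetical helper, not an RHQ tool; it assumes the seeds list chosen for each node has been collected into a dictionary.

```python
def check_upfront_seeds(seeds_by_node):
    """Return a list of problems; an empty list means every node lists all nodes."""
    expected = set(seeds_by_node)
    problems = []
    for node, seeds in seeds_by_node.items():
        missing = expected - set(seeds)
        if missing:
            problems.append(f"{node} is missing seeds: {sorted(missing)}")
    return problems


planned = {
    "N1": ["N1", "N2", "N3"],
    "N2": ["N1", "N2"],        # N3 was left out by mistake
    "N3": ["N1", "N2", "N3"],
}
print(check_upfront_seeds(planned))  # ["N2 is missing seeds: ['N3']"]
```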